Efficient Barrier Using Remote Memory Operations on VIA-Based Clusters

Authors

  • Rinku Gupta
  • Vinod Tipparaju
  • Jarek Nieplocha
  • Dhabaleswar K. Panda
Abstract

Most high-performance scientific applications require efficient support for collective communication. Point-to-point message-passing communication in current-generation clusters is based on the Send/Recv communication model. Collective communication operations built on top of such point-to-point message-passing operations might achieve suboptimal performance. VIA and the emerging InfiniBand architecture support remote DMA (RDMA) operations, which allow data to be moved between nodes with low overhead; they also make it possible to create and provide a logical shared-memory address space across the nodes. In this paper, we focus on barrier, one of the frequently used collective operations. We demonstrate how RDMA write operations can be used to support inter-node barrier in a cluster with SMP nodes. Combining this with a scheme to exploit shared memory within an SMP node, we develop a fast barrier algorithm for clusters of SMP nodes with a cLAN VIA interconnect. Compared to the current barrier algorithms using the Send/Recv communication model, the new approach is shown to reduce barrier latency on a 64-processor (32 dual nodes) system by up to 66%. These results demonstrate that high-performance and scalable barrier implementations can be delivered on current and next-generation VIA/InfiniBand-based clusters with RDMA support.
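The two-level idea in the abstract (shared memory inside an SMP node, RDMA writes between nodes) can be illustrated with a small toy model. The abstract does not say which inter-node pattern the authors use, so the dissemination-style exchange below is an assumption, as are all names (`TwoLevelBarrier`, `run_demo`, `inbox`, and so on). Python threads stand in for processes, and plain list writes stand in for shared-memory stores and RDMA writes into a peer's registered memory. It is a sketch of the general technique, not the paper's implementation.

```python
import threading
import time


class TwoLevelBarrier:
    """Toy model: threads play processes, list writes play memory/RDMA writes."""

    def __init__(self, num_nodes, procs_per_node):
        self.num_nodes = num_nodes
        self.procs_per_node = procs_per_node
        self.rounds = (num_nodes - 1).bit_length()  # ceil(log2(num_nodes))
        # Intra-node "shared memory": one arrival slot per process,
        # one release slot per node (written only by the node leader).
        self.arrived = [[0] * procs_per_node for _ in range(num_nodes)]
        self.released = [0] * num_nodes
        # Inter-node flags: inbox[node][k] has exactly one remote writer.
        self.inbox = [[0] * self.rounds for _ in range(num_nodes)]

    @staticmethod
    def _spin_until(read, epoch):
        # Poll a flag until it reaches the current barrier epoch.
        while read() < epoch:
            time.sleep(0)  # yield so other threads can make progress

    def wait(self, node, local, epoch):
        """Block until every process has called wait() with this epoch (1, 2, ...)."""
        if local != 0:
            # Non-leader: announce arrival, then wait for the leader's release.
            self.arrived[node][local] = epoch
            self._spin_until(lambda: self.released[node], epoch)
            return
        # Leader (local rank 0): gather arrivals from the rest of the node.
        for i in range(1, self.procs_per_node):
            self._spin_until(lambda i=i: self.arrived[node][i], epoch)
        # Assumed dissemination pattern among node leaders.
        for k in range(self.rounds):
            peer = (node + (1 << k)) % self.num_nodes
            self.inbox[peer][k] = epoch  # stands in for an RDMA write
            self._spin_until(lambda k=k: self.inbox[node][k], epoch)
        # Release the other processes on this node.
        self.released[node] = epoch


def run_demo(num_nodes, procs_per_node, iterations):
    """Run repeated barriers; return how often a process left a barrier early."""
    barrier = TwoLevelBarrier(num_nodes, procs_per_node)
    progress = [0] * (num_nodes * procs_per_node)
    violations = []

    def worker(node, local):
        rank = node * procs_per_node + local
        for epoch in range(1, iterations + 1):
            progress[rank] = epoch
            barrier.wait(node, local, epoch)
            # After the barrier, every rank must have reached this epoch.
            if min(progress) < epoch:
                violations.append((rank, epoch))

    threads = [
        threading.Thread(target=worker, args=(n, p))
        for n in range(num_nodes)
        for p in range(procs_per_node)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(violations)
```

Monotonically increasing epoch values let the same flags be reused across barriers without a reset step, since each flag has a single writer. A real VIA or InfiniBand implementation would instead poll registered memory that a remote network adapter writes into, and would have to handle memory registration and write ordering, which this model leaves out.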

Similar resources

Fast and Scalable Barrier Using RDMA and Multicast Mechanisms for InfiniBand-Based Clusters

This paper describes a methodology for efficiently implementing the collective operations, in this case the barrier, on clusters with the emerging InfiniBand Architecture (IBA). IBA provides hardware level support for the Remote Direct Memory Access (RDMA) message passing model as well as the multicast operation. Exploiting these features of InfiniBand to efficiently implement the barrier opera...

Efficient Collective Operations Using Remote Memory Operations on VIA-Based Clusters

High performance scientific applications require efficient and fast collective communication operations. Most collective communication operations have been built on top of point-to-point send/receive primitives. Modern user-level protocols such as VIA and the emerging InfiniBand architecture support remote DMA operations. These operations not only allow data to be moved between the nodes with l...

Fast Collective Operations Using Shared and Remote Memory Access Protocols on Clusters

This paper describes a novel methodology for implementing a common set of collective communication operations on clusters based on symmetric multiprocessor (SMP) nodes. Called Shared-Remote-Memory collectives, or SRM, our approach replaces the point-to-point message passing, traditionally used in implementation of collective message-passing operations, with a combination of shared and remote me...

Protocols and Strategies for Optimizing Performance of Remote Memory Operations on Clusters

The paper describes software architecture for supporting remote memory operations on clusters equipped with high-performance networks such as Myrinet and Giganet/Emulex cLAN. It presents protocols and strategies that bridge the gap between user-level API requirements and low-level network-specific interfaces such as GM and VIA. In particular, the issues of memory registration, management of netw...

Efficient Support for Multicomputing on ATM Networks

The emergence of a new generation of networks will dramatically increase the attractiveness of loosely-coupled multicomputers based on workstation clusters. The key to achieving high performance in this environment is efficient network access, because the cost of remote access dictates the granularity of parallelism that can be supported. Thus, in addition to traditional distribution mechanisms...

Publication date: 2002